\input jmcmac.tex[let,jmc]
\title{SITUATIONS, ACTIONS, AND CAUSAL LAWS}
\bigskip\bigskip\bigskip
Abstract: A formal theory is given concerning situations, causality, and the
possibility and effects of actions. The theory is intended to be used by
the Advice Taker, a computer program that is to decide what to do by reasoning.
Some simple examples are given of descriptions of situations and of deductions that
certain goals can be achieved.
\vfill
The research reported here was supported in part by the Advanced Research 
Projects Agency of the Office of the Secretary of Defense (SD--183).

\whentexed{causal.tex[w85,jmc]}
\eject
\section{INTRODUCTION}

	Although formalized theories have been devised to express the most
important fields of mathematics and some progress has been made in formalizing
certain empirical sciences, there is at present no formal theory in which one
can express the kind of means-ends analysis used in ordinary life. The closest
approach to such a theory of which I am aware is made by Freudenthal in
{\it Lincos} [1].

	Our approach to the artificial intelligence problem requires a formal 
theory. Namely, we believe that human intelligence depends essentially on the
fact that we can represent in language facts about our situation, our goals, and
the effects of the various actions we can perform. Moreover, we can draw
conclusions from the facts to the effect that certain sequences of actions are
likely to achieve our goals.

In {\it Programs with Common Sense} [2], I discussed the advantages of having
a computer program, to be called the Advice Taker, that would reason from
collections of facts about its problem and derive statements about what it could
do. The name Advice Taker came from the hope that its behavior could be improved
by giving it advice in the form of new facts rather than by rewriting the program.
The reader is referred to that paper for further information about the Advice
Taker and to Minsky's paper {\it Steps Toward Artificial Intelligence}[3] for a
general introduction to the subject.

	The first requirement for the Advice Taker is a formal system in which
facts about situations, goals and actions can be expressed, and which contains the
general facts about means and ends as axioms. A start is made in this paper on
providing a system meeting the following specifications.

\itnum General properties of causality and certain obvious but until now
unformalized facts about the possibility and results of actions are given
as axioms.

\itnum It is a logical consequence of the facts of a situation and the
general axioms that certain persons can achieve certain goals
by taking certain actions.

\itnum The formal descriptions of situations should correspond as closely
as possible to what people may reasonably be presumed to know about them when
deciding what to do.
\section{SITUATIONS AND FLUENTS}
	One of the basic entities in our theory is the situation. Intuitively,
a situation is the complete state of affairs at some instant of time. The laws
of motion of a system determine from a situation all future situations. Thus a
situation corresponds to the notion in physics of a point in phase space. 
In physics, laws are expressed in the form of differential equations which
give the complete motion of the point in phase space.

	Our system is not intended for the complete description of situations
nor for the description of complete laws of motion. Instead, we deal with
partial descriptions of situations and partial laws of motion. Moreover, the
emphasis is on the simple qualitative laws of everyday life rather than on
the quantitative laws of physics. As an example, take the fact that if it is
raining and I go outside I will get wet.

Since a situation is a complete state of affairs we can never describe a
situation completely, and therefore we provide no notation for doing so in our
theory. Instead, we state facts about situations in the language of an extended
predicate calculus. Examples of such facts are:
 
\numit $raining(s)$\hfil\break
meaning that it is raining in situation $s$

\numit $time(s)=1963.7205$\hfil\break
giving the value of the time in situation $s$. It will usually prove convenient
to regard the time as a function of the situation rather than vice versa.
The reason for this is that the numerical value of the time is known and
important only where the laws of physics are being used.

\numit $at(I,home,s)$ or $at(I,home)(s)$\hfil\break
meaning that I am at home in situation $s$. We prefer, and will use, the second of
the given notations since it isolates the situation variable, which in most if not
all cases we will be able to suppress completely.

	We shall not describe in this memorandum the logical system we intend
to use. Basically it is a predicate calculus, but we shall use the λ-notation
and, if necessary, conditional expressions as in LISP or ALGOL. We shall extend
the meaning of the Boolean operators to operate on predicates. Thus by
$$at(I, home)∧raining$$
we mean the same as
$$λs. at(I, home)(s)∧raining(s)$$

	A predicate or function whose argument is a situation will be called a
{\it fluent}, the former being called a {\it propositional fluent}. Thus,
{\it raining}, {\it time}, and $at(I, home)$ are all fluents, the first and last being
propositional fluents.

The term was used by Newton for a physical quantity that depends on time, and 
according to my limited understanding of what he meant, the present use 
of the term is justified.

	In our formulas we will usually manage to use the fluents without
explicitly writing variables representing situations. This corresponds to
the use of random variables in probability theory without using variables
representing points in the sample space even though random variables are
supposed to be regarded as functions defined on a sample space.

	In fact we shall go further and give an interpretation of our
theory as a sort of modal logic in which the fluents are not regarded as
functions at all.
\section{CAUSALITY}
	In order to express causal laws we introduce the second order predicate
{\it cause}. The statement
%
$$cause(π)(s)$$
%
where $π$ is a propositional fluent, is intended to mean that the situation $s$
will lead in the future to a situation that satisfies the fluent $π$. Thus,
$cause(π)$ is itself a propositional fluent. As an example of its use we write
%
$$∀s.∀p.[person(p)∧raining∧outside(p)⊃cause(wet(p))](s)$$
%
which asserts that a person who is outside when it is raining will get wet.
We shall make the convention that if $π$ is a fluent then
%
$$∀π$$
%
means the same as
%
$$∀s.π(s)$$

With this convention we can write the previous statement as
%
$$∀∀p.person(p)∧raining∧outside(p)⊃cause(wet(p))$$
%
which suppresses explicit mention of situations. As a second example we give a
special case of the law of falling bodies in the form:

$$\vcenter{\openup\jot\ialign{$#$\hfil\cr\mskip-\thickmuskip
∀∀t∀b∀t↑\prime ∀h. real(t)∧real(t↑\prime )∧real(h)∧body(b)\cr
∧ unsupported(b)∧[height(b)=h]∧[1/2 gt↑2<h]\cr
∧[time=t↑\prime ]⊃cause([height(b)=h-1/2 gt↑2]∧[time=t↑\prime +t])\cr}}$$
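
As a numerical illustration (the particular values are chosen only for this
example), consider an unsupported body at height $h = 20$ metres at time $t↑\prime$,
and take $g = 9.8$ in metre--second units and $t = 1$. Then $1/2 gt↑2 = 4.9 < 20$,
and the law gives
$$cause([height(b)=15.1]∧[time=t↑\prime +1])$$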

	The concept of causality is intended to satisfy the following
general laws, which may be taken as axioms:
$$∀. cause(π)∧[∀.π⊃\rho]⊃cause(\rho)\leqno(C1)$$
$$∀. cause(cause(π))⊃cause(π)\leqno(C2)$$
$$∀. cause(π↓1)∨cause(π↓2)⊃cause(π↓1∨π↓2)\leqno(C3)$$
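
For instance, since $∀.π↓1∧π↓2⊃π↓1$ is valid, taking $π↓1∧π↓2$ for $π$ and $π↓1$
for $\rho$ in C1 gives
$$∀. cause(π↓1∧π↓2)⊃cause(π↓1)$$
so that a cause of a conjunction of conditions is a cause of each of them.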

The fact that we can suppress explicit mention of situations has the following
interesting consequence. Instead of regarding the $π$'s as predicates we may
regard them as propositions and regard {\it cause} as a new modal operator.
The operator $∀$ seems to be equivalent to the N (necessary) operator of
ordinary modal logic.

Conversely, it would appear that modal logic of necessity might be regarded
as a monadic predicate calculus where all quantifiers are over situations.

	In the present case of causality, we seem to have our choice of how to
proceed. Regarding the system as a modal logic seems to have the following
two advantages.

\itemcount=0
\itnum If we use the predicate calculus interpretation we require second
order predicate calculus in order to handle $cause(π)(s)$, while if we 
take the modal interpretation we can get by with first order predicate calculus.

\itnum We shall want decision procedures or at least proof procedures 
for as much of our system as possible. If we use the modal approach many 
problems will involve only substitution of constants for variables in universal
statements and will therefore fall into a fairly readily decidable domain.

	Another example of causality is given by a 2-bit binary counter
that counts every second. In our formalism its behavior may be described by
the statement:
%
$$\vcenter{\openup\jot\ialign{$#$\hfil\cr\mskip-\thickmuskip
∀∀t ∀x↓0 ∀x↓1. time=t∧bit0=x↓0∧bit1=x↓1\cr
⊃ cause(time=t+1∧(bit0=x↓0\oplus 1)∧(bit1=x↓1\oplus (x↓0∧1)))\cr}}$$
%
In this example {\it time}, $bit0$ and $bit1$ are fluents while $t$, $x↓0$ and
$x↓1$ are numerical variables. The distinction is made clearer if we use the
more long-winded statement
%
$$\vcenter{\openup\jot\ialign{$#$\hfil\cr\mskip-\thickmuskip
∀s∀t∀x↓0∀x↓1. time(s) = t∧bit0(s) = x↓0∧bit1(s) = x↓1\cr
⊃cause(λs↑\prime .time(s↑\prime ) = t+1∧(bit0(s↑\prime ) = x↓0\oplus 1)∧
(bit1(s↑\prime) = x↓1\oplus (x↓0∧1)))(s)\cr}}$$
%
In this case however we can rewrite the statement in the form
%
$$\vcenter{\openup\jot\ialign{$#$\hfil\cr\mskip-\thickmuskip
∀s.cause(λs↑\prime .[time(s↑\prime ) = time(s)+1]∧[bit0(s↑\prime)= bit0(s)\oplus 1]\cr
∧[bit1(s↑\prime) = bit1(s) \oplus (bit0(s)∧1)])(s)\cr}}$$
%
Thus we see that the suppression of explicit mention of the situations forced us
to introduce the auxiliary quantities $t$, $x↓0$ and $x↓1$ which are required because
we can no longer use functions of two different situations in the same formula.
Nevertheless, the $s$-suppressed form may still be worthwhile since it admits the
modal interpretation.
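
For example, in a situation in which $time = t$, $bit0 = 1$ and $bit1 = 0$, the law
gives
$$cause(time=t+1∧bit0=0∧bit1=1)$$
and in a situation satisfying that condition it gives in turn
$$cause(time=t+2∧bit0=1∧bit1=1)$$
so that the counter runs through the values 01, 10, 11, 00 as expected.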

	The time as a fluent satisfies certain axioms. The fact that there is
only one situation corresponding to a given value of the time may be expressed
by the axiom
$$∀∀π∀\rho∀t. cause(time=t∧π)∧cause(time=t∧\rho)⊃cause(time=t∧π∧\rho)\leqno(T1)$$
Another axiom is
$$∀∀t.real(t)∧t>time⊃cause(time=t).\leqno(T2)$$
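
For example, in a situation in which $time = 1963.7205$, axiom T2 with $t = 1964$
gives $cause(time = 1964)$; every later value of the time will eventually be reached.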
\section{ACTIONS AND THE OPERATOR {\it can}}
	
	We shall regard the fact that a person performs a certain action in
a situation as a propositional fluent. Thus
$$moves(person, object, location)(s)$$
is regarded as asserting that {\it person} moves {\it object} to {\it location}
in the situation $s$. The effect of moving something is described by
$$∀∀p ∀o ∀l. moves(p,o,l)⊃cause(at(o,l))$$
or in long form
$$∀s ∀p ∀o ∀l. moves(p,o,l)(s)⊃cause(λs↑\prime .at(o,l)(s↑\prime ))(s)$$

In order to discuss the ability of persons to achieve goals and to perform
actions we introduce the operator {\it can}.
$$can(p,π)(s)$$
asserts that the person $p$ can make the situation $s$ satisfy $π$. We see that
$can(p,π)$ is a propositional fluent and that like {\it cause}, {\it can}
may be regarded either as a second order predicate or a modal operator. Our
most common use of {\it can} will be to assert that a person can perform 
a certain  action. Thus we write
$$can(p,moves (p,o,l))(s)$$
to assert that in situation $s$, the person $p$ can move object $o$ to location
$l$.

	The operator {\it can} satisfies the axioms
$$∀∀π ∀\rho∀p. can(p,π)∧[π⊃\rho]⊃can(p,\rho)\leqno(K1)$$
$$∀∀π ∀p↓1 ∀p↓2. \sim[can(p↓1,π)∧can(p↓2,\sim π)]\leqno(K2)$$
$$∀∀p ∀π∀\rho. can(p,π)∨can(p,\rho)⊃can(p,π∨\rho).\leqno(K3)$$

Using K1 and
$$can(p,moves(p,o,l))$$
and
$$∀∀p ∀o ∀l. moves(p,o,l)⊃cause(at(o,l))$$
we can deduce
$$can(p,cause(at(o,l)))$$
which shows that the operators {\it can} and {\it cause} often show up in
the same formula.
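
Explicitly, this is the instance of K1 in which $moves(p,o,l)$ is substituted for
$π$ and $cause(at(o,l))$ for $\rho$:
$$can(p,moves(p,o,l))∧[moves(p,o,l)⊃cause(at(o,l))]⊃can(p,cause(at(o,l)))$$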

	The ability of people to perform joint actions can be expressed
by formulas like
$$can(p↓1,can(p↓2,marry(p↓1,p↓2)))$$
which suggests the commutative axiom
$$∀∀p↓1 ∀p↓2 ∀π. can(p↓1,can(p↓2,π))⊃can(p↓2,can(p↓1,π))\leqno(K4)$$

A kind of transitivity is expressed by the following:\hfil\break
\bigskip
{\bf Theorem}---from\hfil\break
1)\quad $can(p,cause(π))$
 
\noindent and\hfil\break
2)\quad $∀.π⊃can(p,cause(\rho))$
 
\noindent it follows that\hfil\break
3)\quad $can(p,cause(can(p,cause(\rho))))$
\bigskip
\item {\bf Proof}---Substitute $can(p,cause(\rho))$ for $\rho$ in axiom
C1 and substitute $cause(π)$ for $π$ and $cause(can(p,cause(\rho)))$ for 
$\rho$ in axiom K1. The conclusion then follows by propositional calculus.
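Written out, the substitution in C1 gives
$$∀. cause(π)∧[∀.π⊃can(p,cause(\rho))]⊃cause(can(p,cause(\rho)))$$
so that by 2) and the propositional calculus
$$∀. cause(π)⊃cause(can(p,cause(\rho)))$$
while the substitution in K1 gives
$$∀. can(p,cause(π))∧[cause(π)⊃cause(can(p,cause(\rho)))]⊃can(p,cause(can(p,cause(\rho))))$$
whose antecedent is supplied by 1) and the preceding line, giving 3).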
\bigskip

	In order to discuss the achievement of goals requiring several 
consecutive actions we introduce $canult(p,π)$ which is intended to mean that
the person $p$ can ultimately bring about a situation satisfying $π$. We
connect it with {\it can} and {\it cause} by means of the axiom
$$∀∀p ∀π. π∨can(p,cause(canult(p,π)))⊃canult(p,π)\leqno(KC1)$$

	This axiom partially corresponds to the LISP-type recursive definition
$$canult(p,π)=π∨can(p,cause(canult(p,π))).$$
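Unwinding this definition once more gives
$$canult(p,π)=π∨can(p,cause(π∨can(p,cause(canult(p,π)))))$$
so that $canult(p,π)$ holds if $π$ already holds, or if $p$ can cause a situation
in which $π$ holds or from which $p$ can again act toward $π$, and so on.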

We also want the axiom
$$∀∀p∀π. cause(canult(p,π))⊃canult(p,π).\leqno(KC2)$$
\section{EXAMPLES}
1) The monkey can get the bananas

	The first example we shall consider is a situation in which
a monkey is in a room where a bunch of bananas is hanging from the 
ceiling too high to reach. In the corner of the room is a box, and
the solution to the monkey's problem is to move the box under the bananas 
and climb onto the box from which the bananas can be reached.

	We want to describe the situation in such a way that it follows from our
axioms and the description that the monkey can get the bananas. In this
memorandum we shall not discuss the heuristic problem of how monkeys
do or even might solve the problem. Specifically, we shall prove that
$$canult(monkey,has(monkey,bananas)).$$
The situation is described in a very oversimplified way by the following
seven statements:
$$∀∀u. place(u)⊃can(monkey,move(monkey,box,u))\leqno(H1)$$
$$∀∀u ∀v ∀p. move(p,v,u)⊃cause(at(v,u))\leqno(H2)$$
$$∀ can(monkey,climbs(monkey,box))\leqno(H3)$$
$$∀ ∀u∀v∀p. at(v,u)∧climbs(p,v)⊃cause(at(v,u)∧on(p,v))\leqno(H4)$$
$$∀ place(under(bananas))\leqno(H5)$$
$$\vcenter{\openup\jot\ialign{$#$\hfil\cr\mskip-\thickmuskip
∀ at(box,under(bananas))∧on(monkey,box)\cr
⊃can(monkey,reach(monkey,bananas))\cr}}\leqno(H6)$$
$$∀ ∀p ∀x. reach(p,x)⊃cause(has(p,x))\leqno(H7)$$
The reasoning proceeds as follows:\hfil\break
 From H1 and H5, by substitution of $under(bananas)$ for $u$ and PC (propositional
calculus), we get
$$can(monkey,move(monkey,box,under(bananas))).\leqno(1)$$
Using 1) and H2 and axiom K1, we get
$$can(monkey,cause(at(box,under(bananas))))\leqno(2)$$
Similarly H3 and H4 and K1 give
$$\vcenter{\openup\jot\ialign{$#$\hfil\cr\mskip-\thickmuskip
at(box,under(bananas))\cr
⊃can(monkey,cause(at(box,under(bananas))∧on(monkey,box)))\cr}}\leqno(3)$$
Then H6 and H7 give
$$\vcenter{\openup\jot\ialign{$#$\hfil\cr\mskip-\thickmuskip
at(box,under(bananas))∧on(monkey,box)\cr
⊃can(monkey,cause(has(monkey,bananas)))\cr}}\leqno(4)$$
Now, theorem 1 is used to combine 2), 3) and 4) to get
$$\vcenter{\openup\jot\ialign{$#$\hfil\cr\mskip-\thickmuskip
can(monkey,cause(can(monkey,\cr
cause(can(monkey,cause(has(monkey,bananas)))))))\cr}}\leqno(5)$$

Using KC1, we reduce this to
$$canult(monkey,has(monkey,bananas))$$
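
In detail, writing $π$ for $has(monkey,bananas)$, KC1 gives
$∀.π⊃canult(monkey,π)$, whence by C1 and K1
$$∀. can(monkey,cause(π))⊃can(monkey,cause(canult(monkey,π)))$$
and so, by KC1 again,
$$∀. can(monkey,cause(π))⊃canult(monkey,π).$$
Applying the same argument twice more to the outer layers of 5) yields the conclusion.
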
2) An endgame

	A simple situation in a two player game can arise where player $p↓1$
has two moves, but whichever he chooses player $p↓2$ has a move that will beat
him. This situation may be described as follows:
\itemcount=0
\itnum   $can(p↓1,m↓1)∧can(p↓1,m↓2)∧(m↓1∨m↓2)$
\itnum   $[m↓1⊃cause(π↓1)]∧[m↓2⊃cause(π↓2)]$
\itnum   $∀.π↓1∨π↓2⊃[can(p↓2,n↓1)∧can(p↓2,n↓2)∧(n↓1∨n↓2)]$
\itnum   $∀.(π↓1∧n↓1)∨(π↓2∧n↓2)⊃cause(win(p↓2))$

We would like to be able to draw the conclusion

\itnum   $canult(p↓2,win(p↓2))$

We proceed as follows: From 1) and 2) we get

\itnum   $cause(π↓1)∨cause(π↓2)$

and we use axiom C3 to get

\itnum   $cause(π↓1∨π↓2)$

Next we weaken 3) to get

\itnum   $∀.π↓1⊃can(p↓2,n↓1)$ and
\itnum   $∀.π↓2⊃can(p↓2,n↓2)$

and then we use K1 to get
\itnum  $∀.π↓1⊃can(p↓2,π↓1∧n↓1)$
\itnum  $∀.π↓2⊃can(p↓2,π↓2∧n↓2)$

The propositional calculus gives
\itnum  $∀.π↓1∨π↓2⊃can(p↓2,π↓1∧n↓1)∨can(p↓2,π↓2∧n↓2)$

and using K3 we get
\itnum  $∀.π↓1∨π↓2⊃can(p↓2,[π↓1∧n↓1]∨[π↓2∧n↓2])$

which together with 4) and K1 gives
\itnum  $∀.π↓1∨π↓2⊃can(p↓2,cause(win(p↓2)))$

which together with 7) and C1 gives
\itnum  $cause(can(p↓2,cause(win(p↓2))))$

Using the axioms for {\it canult} we now get
\itnum  $canult(p↓2,win(p↓2))$
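
More explicitly, KC1 gives $∀. win(p↓2)⊃canult(p↓2,win(p↓2))$, whence by C1, K1
and KC1 again
$$∀. can(p↓2,cause(win(p↓2)))⊃canult(p↓2,win(p↓2)).$$
Applying C1 and then KC2 to $cause(can(p↓2,cause(win(p↓2))))$ then gives the
conclusion.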
\section{NOTE}

	After finishing the bulk of this memorandum I came across
{\it The Syntax of Time Distinctions} [4] by A.N. Prior. Prior
defines modal operators $P$ and $F$ where
$$P(π) {\it means `it has been the case that} π' {\it and,}$$
$$F(π) {\it means `it will be the case that} π'.$$

	He subjects these operators to a number of axioms and rules of
inference in close analogy to the well-known[5] modal logic of possibility. He 
also interprets this logic in a restricted predicate calculus where the 
variables range over times. He then extends his logic to include a 
somewhat undetermined future and claims (unconvincingly) that this logic cannot
be interpreted in predicate calculus.

	I have not yet made a detailed comparison of our logic with Prior's,
but here are some tentative conclusions.
\itemcount=0
\itnum The causality logic should be extended to allow inference about the past.
\itnum Causality logic should be extended to allow inference that certain
propositional fluents will always hold.
\itnum $cause(π)$ satisfies the axioms for his $F(π)$ which means that his 
futurity theory possesses, from his point of view, non-standard models. Namely, 
a collection of functions $p↓1(t),p↓2(t)$ may satisfy his futurity axioms and 
assign truth to $p(1)∧\sim (Fp)(0)$. In our system this is O.K. because
something can happen without being caused to happen.
\itnum If we combine his past and futurity axioms, our system will no
longer fit his axioms
$$p⊃\sim F(\sim P(p))\leqno(PF1)$$
$$p⊃\sim P(\sim F(p))\leqno(PF2)$$
since we do not wish to say that whatever is, was always inevitable.
\section{REFERENCES}

\itemcount=0
\itnum {\bf Freudenthal, H.A. (1960)}, {\it Lincos, Design of a Language for
Cosmic Intercourse, Part I}, Amsterdam.
\itnum {\bf McCarthy, J. (1959)}, ``Programs with Common Sense'', Proceedings of the
Symposium on Mechanization of Thought Processes, Her Majesty's Stationery
Office, London, England.
\itnum {\bf Minsky, M.L. (1961)}, ``Steps Toward Artificial Intelligence'',
{\bf Proceedings of the IRE}, Vol. 49, No. 1, January 1961.
\itnum {\bf Prior, A.N. (1958)}, ``The Syntax of Time Distinctions'', Franciscan
Studies.
\itnum {\bf von Wright, G.H. (1951)}, {\it An Essay in Modal Logic}, Amsterdam.
\vfil\eject\end